Font Descriptor Construction for Printed Thai Character Recognition

Authors

  • Ungsumalee Suttapakti
  • Kuntpong Woraratpanya
  • Kitsuchart Pasupa
Abstract

The ongoing evolution of fonts into many types has a great impact on the recognition performance of optical character recognition (OCR) systems. The greater the diversity of fonts, the lower the recognition accuracy, particularly for Thai fonts. To overcome this obstacle, this paper proposes a font descriptor for printed Thai character recognition. Such a descriptor serves as a representative of various fonts and sizes. The font descriptor construction is based on principal component analysis (PCA) in combination with predefined patterns in multi-level processing. The proposed font descriptor is tested on a Thai character image corpus consisting of consonants, vowels, and tones. The experimental results show that the proposed font descriptor is efficient and robust to variations in font type and size.
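The abstract does not spell out how the descriptor is built, so the following is only a minimal sketch of the general idea of a PCA-based, per-character descriptor. It is not the authors' method: the predefined patterns and the multi-level processing are not modeled, and the function names (`build_descriptor`, `reconstruction_error`, `classify`), the toy 4x4 glyphs, and the reconstruction-error classifier are all assumptions made for illustration.

```python
import numpy as np


def build_descriptor(samples, n_components):
    """Summarize several renderings of ONE character (rows = flattened
    glyph images, e.g. one per font/size) as a mean plus principal axes."""
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(glyph, descriptor):
    """Distance between a glyph and its projection onto the descriptor's
    PCA subspace; small values mean the glyph is well explained."""
    mean, basis = descriptor
    centered = np.asarray(glyph, dtype=float) - mean
    projected = basis.T @ (basis @ centered)
    return float(np.linalg.norm(centered - projected))


def classify(glyph, descriptors):
    """Pick the label whose descriptor reconstructs the glyph best."""
    return min(descriptors,
               key=lambda label: reconstruction_error(glyph, descriptors[label]))


# Toy data (hypothetical): two 4x4 "characters", each rendered in five
# noisy variants standing in for different fonts and sizes.
rng = np.random.default_rng(0)
vertical = np.zeros((4, 4))
vertical[:, 1] = 1.0
horizontal = np.zeros((4, 4))
horizontal[2, :] = 1.0


def render_variants(pattern, n_variants=5, noise=0.05):
    flat = pattern.ravel()
    return np.stack([flat + noise * rng.standard_normal(flat.size)
                     for _ in range(n_variants)])


descriptors = {
    "vertical": build_descriptor(render_variants(vertical), n_components=2),
    "horizontal": build_descriptor(render_variants(horizontal), n_components=2),
}

# An unseen variant of the vertical stroke, classified by subspace fit.
query = render_variants(vertical, n_variants=1)[0]
predicted = classify(query, descriptors)
```

In this sketch each character gets one compact descriptor regardless of how many font variants were used to build it, which is one plausible reading of a descriptor acting as "a representative of various fonts and sizes".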


Similar resources

A Modified Self-organizing Map Neural Network to Recognize Multi-font Printed Persian Numerals (RESEARCH NOTE)

This paper proposes a new method to distinguish printed digits, regardless of font and size, using neural networks. Unlike our proposed method, existing neural network based techniques are only able to recognize the trained fonts. These methods need a large database containing digits in various fonts. New fonts are often introduced to the public, which may not be truly recognized by the Opti...


Cryptogram Decoding for Optical Character Recognition

Optical character recognition (OCR) systems for machine-printed documents typically require large numbers of font styles and character models to work well. When given a document printed in an unseen font, the performance of those systems degrades even in the absence of noise. In this paper, we perform OCR in an unsupervised fashion without using any character models by using a cryptogram decodin...


Multi-font Optical Character Recognition System for Printed Telugu Text

The Telugu OCR systems currently available on the market recognize only specific Telugu fonts. This paper describes the development of a multi-font OCR system for printed Telugu characters using artificial neural networks. In this system, character classification is carried out using a multilayer neural network architecture.


A Prototype of Multi-Font Printed Chinese Character Reader

An approach to multi-font printed Chinese character recognition is proposed in this paper. The problems of inputting image of characters, preprocessing, character segmentation, feature extraction as well as character classification have been discussed. According to the characteristics of multi-font printed Chinese characters, the number of cutting across strokes, the external and internal areas w...


Multi-feature Extraction for Printed Thai Character Recognition

This paper presents a simplified printed Thai character recognition system using multiple feature extraction and character classification. Three relevant pieces of information extracted from a set of training character images are the direction of each character’s contour, the density of the character body, and character peripheral information. This set of features is used as reference for classifying unknown ...




Publication date: 2013